
Conversation


@s3inlc s3inlc commented Jun 20, 2024

This PR addresses several issues with loading/handling on systems that have a large number of tasks or a large number of agents:

  • When creating chunks, a single global lock file was used so far, which was locked regardless of which task a chunk was being created for. Locking per task is sufficient, as creating two chunks for two different tasks at the same time poses no problem.
  • When an active task has a very large number of chunks, chunk assignment on the server may take so long that the agent assumes the connection has timed out (30 s). This can lead to situations where all agents end up in a loop of requesting chunks/tasks, while the server can no longer respond fast enough because checking an active task meant looping over all of its chunks (which obviously takes longer the more chunks there are). This is solved by the changes in TaskUtils: instead of looping over all existing chunks, it now loops only over the non-completed ones and uses SUM in SQL to determine the other required values.
  • In general, there are many places in the code where multiple queries sum/count/max over columns (or, even worse, in most of those places the entries are loaded and looped over in code). With many chunks, the biggest problem was in the getTaskInfo() function, which is used by the current old UI and caused very long loading times on the tasks page when tasks with many chunks were listed. To tackle this, the DBA was extended slightly so it can sum/count/max/... over multiple columns in the same query (as long as the WHERE conditions are the same for all of them). In the specific case of getTaskInfo(), this reduced loading times by roughly a factor of 5-6.
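
The per-task locking change described in the first point can be sketched as follows. This is a minimal Python sketch on a Unix system; the lock-file naming scheme and function names are illustrative only, not Hashtopolis's actual (PHP) implementation:

```python
import fcntl
import os
import tempfile

LOCK_DIR = tempfile.gettempdir()

def acquire_task_lock(task_id):
    """Lock chunk creation for a single task (illustrative scheme).

    With one lock file per task, creating a chunk for task A never
    blocks chunk creation for task B. A single global lock file would
    serialize chunk creation across all tasks.
    """
    fh = open(os.path.join(LOCK_DIR, f"chunk-task-{task_id}.lock"), "w")
    # Non-blocking here so a conflict raises instead of waiting.
    fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return fh

def release_task_lock(fh):
    fcntl.flock(fh, fcntl.LOCK_UN)
    fh.close()

# Two different tasks can create chunks concurrently:
lock_a = acquire_task_lock(1)
lock_b = acquire_task_lock(2)  # succeeds; a single global lock would conflict here
release_task_lock(lock_a)
release_task_lock(lock_b)
```

A second attempt to lock the same task's file would raise, which is exactly the serialization that is still wanted within one task.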

The DBA function change was already adopted for the getChunkInfo() function as well, but there are likely many more places where using either a single aggregation function or the newly added multi-column aggregation could yield additional speedups.
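
The aggregation idea can be illustrated with plain SQL (using Python's sqlite3 here; the Chunk table layout, column names, and state values are simplified placeholders, not the actual Hashtopolis schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Chunk (taskId INT, length INT, checked INT, state INT)")
conn.executemany(
    "INSERT INTO Chunk VALUES (?, ?, ?, ?)",
    [
        (1, 100, 100, 9),  # state 9 = completed (placeholder value)
        (1, 100, 40, 2),   # running
        (1, 100, 0, 0),    # pending
    ],
)

# Before: one query (or worse, an application-side loop) per aggregate value.
total = conn.execute("SELECT SUM(length) FROM Chunk WHERE taskId = 1").fetchone()[0]
done = conn.execute("SELECT SUM(checked) FROM Chunk WHERE taskId = 1").fetchone()[0]
count = conn.execute("SELECT COUNT(*) FROM Chunk WHERE taskId = 1").fetchone()[0]

# After: one multi-column aggregation, possible because the WHERE clause
# is identical for every aggregate.
total2, done2, count2 = conn.execute(
    "SELECT SUM(length), SUM(checked), COUNT(*) FROM Chunk WHERE taskId = 1"
).fetchone()
assert (total, done, count) == (total2, done2, count2)

# And only the non-completed chunks still need to be walked individually:
open_chunks = conn.execute(
    "SELECT rowid FROM Chunk WHERE taskId = 1 AND state != 9"
).fetchall()
```

The saving is not in the aggregation itself (the database does that either way) but in collapsing several round trips, or a row-by-row loop in application code, into one query.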

zyronix and others added 3 commits April 22, 2024 10:48
Adding loops to scan through lines to support importing hashes longer than 1024 bytes
@s3inlc s3inlc requested a review from zyronix June 20, 2024 12:08
@s3inlc s3inlc changed the base branch from master to dev September 27, 2024 13:46
@s3inlc s3inlc requested review from jessevz and removed request for zyronix September 27, 2024 13:47
@jessevz jessevz self-assigned this Oct 14, 2024
@s3inlc s3inlc requested a review from jessevz November 20, 2024 10:41
@jessevz jessevz merged commit fe16f39 into dev Nov 21, 2024
1 check passed
@s3inlc s3inlc deleted the aggregation-optimization branch July 31, 2025 06:52

Status: 🎉 Done


4 participants